77 research outputs found

Comparison of Resampling Schemes for Particle Filtering

Authors: Cappé Olivier, Douc Randal, Moulines Eric
Publication date: 01/01/2005

This contribution is devoted to the comparison of various resampling approaches that have been proposed in the literature on particle filtering. It is first shown using simple arguments that the so-called residual and stratified methods do yield an improvement over the basic multinomial resampling approach. A simple counter-example showing that this property does not hold true for systematic resampling is given. Finally, some results on the large-sample behavior of the simple bootstrap filter algorithm are given. In particular, a central limit theorem is established for the case where resampling is performed using the residual approach.
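
For readers unfamiliar with the four schemes named in this abstract, the following is a minimal textbook-style sketch, not the authors' code. It assumes normalized weights, and every function name in it is hypothetical. Each function returns the indices of the particles selected for the next generation.

```python
import bisect
import itertools
import random


def _invert_cdf(weights, points):
    """Map points in [0, 1) to particle indices through the cumulative weights."""
    cdf = list(itertools.accumulate(weights))
    last = len(weights) - 1
    # min(...) guards against round-off when the weights sum to slightly under 1.
    return [min(bisect.bisect_right(cdf, u), last) for u in points]


def multinomial(weights, rng):
    """n independent uniform draws over the whole interval [0, 1)."""
    n = len(weights)
    return _invert_cdf(weights, [rng.random() for _ in range(n)])


def stratified(weights, rng):
    """One independent uniform draw inside each stratum [i/n, (i+1)/n)."""
    n = len(weights)
    return _invert_cdf(weights, [(i + rng.random()) / n for i in range(n)])


def systematic(weights, rng):
    """A single uniform offset shared by all n strata."""
    n = len(weights)
    u = rng.random()
    return _invert_cdf(weights, [(i + u) / n for i in range(n)])


def residual(weights, rng):
    """Keep floor(n * w_i) copies deterministically, then draw the remainder
    multinomially from the leftover fractional weights."""
    n = len(weights)
    counts = [int(n * w) for w in weights]
    indices = [i for i, c in enumerate(counts) for _ in range(c)]
    remaining = n - len(indices)
    if remaining > 0:
        leftovers = [n * w - c for w, c in zip(weights, counts)]
        total = sum(leftovers)
        probs = [r / total for r in leftovers]
        indices += _invert_cdf(probs, [rng.random() for _ in range(remaining)])
    return indices


rng = random.Random(0)                  # fixed seed, for reproducibility only
weights = [0.5, 0.25, 0.125, 0.125]     # illustrative normalized weights
for scheme in (multinomial, residual, stratified, systematic):
    print(scheme.__name__, scheme(weights, rng))
```

The schemes differ only in how the points fed to the inverse CDF are generated, which is what makes a variance comparison between them natural.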

arXiv.org e-Print Archive

HAL-Polytechnique

HAL UVSQ

Stochastic Bandit Models for Delayed Conversions

Authors: Cappé Olivier, Perchet Vianney, Vernade Claire
Publication date: 12/07/2017

Online advertising and product recommendation are important application domains for multi-armed bandit methods. In these fields, the reward that is immediately available is most often only a proxy for the actual outcome of interest, which we refer to as a conversion. For instance, in web advertising, clicks can be observed within a few seconds after an ad display, but the corresponding sale, if any, will take hours, if not days, to happen. This paper proposes and investigates a new stochastic multi-armed bandit model in the framework proposed by Chapelle (2014), based on empirical studies in the field of web advertising, in which each action may trigger a future reward that will then happen with a stochastic delay. We assume that the probability of conversion associated with each action is unknown while the distribution of the conversion delay is known, distinguishing between the (idealized) case where the conversion events may be observed whatever their delay and the more realistic setting in which late conversions are censored. We provide performance lower bounds as well as two simple but efficient algorithms based on the UCB and KLUCB frameworks. The latter algorithm, which is preferable when conversion rates are low, is based on a Poissonization argument, of independent interest in other settings where aggregation of Bernoulli observations with different success probabilities is required.

Comment: Conference on Uncertainty in Artificial Intelligence, Aug 2017, Sydney, Australia
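
To illustrate why a KL-based index can be preferable when conversion rates are low, here is a minimal sketch of the standard UCB1 and Bernoulli KL-UCB indices. It ignores delays and censoring entirely, so it is not the algorithm of the paper. The function names and the exploration budget log(t) are assumptions made for illustration.

```python
import math


def bernoulli_kl(p, q):
    """Kullback-Leibler divergence between Bernoulli(p) and Bernoulli(q)."""
    eps = 1e-12  # clipping keeps the logarithms finite at 0 and 1
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def ucb1_index(mean, pulls, t):
    """Classical UCB1 index: empirical mean plus a Hoeffding-style bonus."""
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def kl_ucb_index(mean, pulls, t, iterations=50):
    """Largest q in [mean, 1] with pulls * KL(mean, q) <= log(t), by bisection.

    The budget log(t) is an assumed, simplified exploration function.
    """
    budget = math.log(t) / pulls
    low, high = mean, 1.0
    for _ in range(iterations):
        mid = (low + high) / 2
        if bernoulli_kl(mean, mid) <= budget:
            low = mid
        else:
            high = mid
    return low


# Illustrative numbers: a 5% empirical conversion rate after 100 pulls at round 1000.
print(ucb1_index(0.05, 100, 1000), kl_ucb_index(0.05, 100, 1000))
```

For small empirical means the KL-UCB index sits well below the UCB1 index, because the Bernoulli divergence grows faster than the quadratic bound near zero.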

arXiv.org e-Print Archive

Multiple-Play Bandits in the Position-Based Model

Authors: Cappé Olivier, Lagrée Paul, Vernade Claire
Publication date: 05/01/2016

Sequentially learning to place items in multi-position displays or lists is a task that can be cast into the multiple-play semi-bandit setting. However, a major concern in this context is when the system cannot decide whether the user feedback for each item is actually exploitable. Indeed, much of the content may have been simply ignored by the user. The present work proposes to exploit available information regarding the display position bias under the so-called Position-based click model (PBM). We first discuss how this model differs from the Cascade model and its variants considered in several recent works on multiple-play bandits. We then provide a novel regret lower bound for this model as well as computationally efficient algorithms that display good empirical and theoretical performance.

arXiv.org e-Print Archive

HAL-CentraleSupelec

HAL-Rennes 1

Sequential Monte Carlo smoothing with application to parameter estimation in non-linear state space models

Authors: Cappé Olivier, Douc Randal, Moulines Eric, Olsson Jimmy
Publication venue: Bernoulli Society for Mathematical Statistics and Probability
Publication date: 01/01/2008

This paper concerns the use of sequential Monte Carlo (SMC) methods for smoothing in general state space models. A well-known problem when applying the standard SMC technique in the smoothing mode is that the resampling mechanism introduces degeneracy of the approximation in the path space. However, when performing maximum likelihood estimation via the EM algorithm, all functionals involved are of additive form for a large subclass of models. To cope with the problem in this case, a modification of the standard method (based on a technique proposed by Kitagawa and Sato) is suggested. Our algorithm relies on forgetting properties of the filtering dynamics and the quality of the estimates produced is investigated, both theoretically and via simulations.

Comment: Published at http://dx.doi.org/10.3150/07-BEJ6150 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm

arXiv.org e-Print Archive

CiteSeerX

Crossref

Lund University Publications

HAL-Polytechnique

HAL UVSQ

Efficient Learning of Sparse Conditional Random Fields for Supervised Sequence Labelling

Authors: Cappé Olivier, Lavergne Thomas, Sokolovska Nataliya, Yvon François
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 03/01/2010

Conditional Random Fields (CRFs) constitute a popular and efficient approach for supervised sequence labelling. CRFs can cope with large description spaces and can integrate some form of structural dependency between labels. In this contribution, we address the issue of efficient feature selection for CRFs based on imposing sparsity through an L1 penalty. We first show how sparsity of the parameter set can be exploited to significantly speed up training and labelling. We then introduce coordinate descent parameter update schemes for CRFs with L1 regularization. We finally provide some empirical comparisons of the proposed approach with state-of-the-art CRF training strategies. In particular, it is shown that the proposed approach is able to take advantage of the sparsity to speed up processing and hence potentially handle higher-dimensional models.
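
As a generic illustration of how an L1 penalty produces exact zeros under coordinate descent, the sketch below applies the usual soft-thresholding update to a single coordinate of a local quadratic approximation of the loss. It is not the CRF-specific scheme of the paper. The function names are hypothetical, and the gradient and curvature values are assumed to be supplied by the caller.

```python
def soft_threshold(z, threshold):
    """Shrink z toward zero by `threshold`, returning exactly 0 inside the band."""
    if z > threshold:
        return z - threshold
    if z < -threshold:
        return z + threshold
    return 0.0


def l1_coordinate_update(weight, gradient, curvature, penalty):
    """Minimize gradient*d + 0.5*curvature*d**2 + penalty*|weight + d| over d
    and return the new coordinate value weight + d (curvature must be > 0)."""
    return soft_threshold(weight - gradient / curvature, penalty / curvature)


# A coordinate at zero stays at zero while |gradient| <= penalty.
print(l1_coordinate_update(0.0, 0.5, 1.0, 1.0))
# A strong enough gradient moves the coordinate, shrunk by penalty / curvature.
print(l1_coordinate_update(1.0, -2.0, 2.0, 1.0))
```

Coordinates that remain at zero can be skipped in later computations, which is the kind of sparsity a training procedure can exploit.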

arXiv.org e-Print Archive

Crossref

Adaptive MCMC with online relabeling

Authors: Bardenet Rémi, Cappé Olivier, Fort Gersende, Kégl Balázs
Publication venue: Bernoulli Society for Mathematical Statistics and Probability
Publication date: 27/07/2015

When targeting a distribution that is artificially invariant under some permutations, Markov chain Monte Carlo (MCMC) algorithms face the label-switching problem, rendering marginal inference particularly cumbersome. Such a situation arises, for example, in the Bayesian analysis of finite mixture models. Adaptive MCMC algorithms such as adaptive Metropolis (AM), which self-calibrates its proposal distribution using an online estimate of the covariance matrix of the target, are no exception. To address the label-switching issue, relabeling algorithms associate a permutation with each MCMC sample, trying to obtain reasonable marginals. In the case of adaptive Metropolis (Bernoulli 7 (2001) 223-242), an online relabeling strategy is required. This paper is devoted to the AMOR algorithm, a provably consistent variant of AM that can cope with the label-switching problem. The idea is to nest relabeling steps within the MCMC algorithm based on the estimation of a single covariance matrix that is used both for adapting the covariance of the proposal distribution in the Metropolis algorithm step and for online relabeling. We compare the behavior of AMOR to similar relabeling methods. In the case of compactly supported target distributions, we prove a strong law of large numbers for AMOR and its ergodicity. These are the first results on the consistency of an online relabeling algorithm to our knowledge. The proof underlines latent relations between relabeling and vector quantization.

Comment: Published at http://dx.doi.org/10.3150/13-BEJ578 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
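
The idea of relabeling each sample online can be sketched in a deliberately simplified form. The code below is not AMOR: it replaces the covariance-based criterion described in the abstract with plain squared Euclidean distance to a running mean, and it enumerates all permutations, which is only feasible for a handful of components. All names are hypothetical.

```python
import itertools


def relabel(sample, running_mean):
    """Return the permutation of `sample` closest to `running_mean`
    in squared Euclidean distance (an assumed, simplified criterion)."""
    best = min(
        itertools.permutations(sample),
        key=lambda perm: sum((x - m) ** 2 for x, m in zip(perm, running_mean)),
    )
    return list(best)


def update_mean(running_mean, relabeled, step):
    """Online mean update after seeing the `step`-th relabeled sample."""
    return [m + (x - m) / step for m, x in zip(running_mean, relabeled)]


# Two mixture-component means whose labels were swapped in the raw sample.
mean = [-5.0, 5.0]
fixed = relabel([5.1, -4.9], mean)
print(fixed)
print(update_mean(mean, fixed, step=10))
```

Because the running mean is updated only with relabeled samples, the same statistic drives both the relabeling and the adaptation, which mirrors the nesting described in the abstract.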

arXiv.org e-Print Archive

HAL-CentraleSupelec

HAL-IN2P3

HAL-Rennes 1